Contents

1. Overview

2. Introduction to SECR

3. Loading the software

4. Data import

5. Making a mask

6. Model fitting

7. Plotting results

8. Model selection

9. Menu options

10. Reporting bugs


1. Overview

The gibbonsecr software package uses Spatially Explicit Capture–Recapture (SECR) methods to estimate the density of gibbon populations from acoustic survey data. This manual begins with a brief introduction to the theory behind SECR and then describes the main components of the software.


2. Introduction to SECR

2.1 Basic setup

2.2 Estimated bearings

2.3 Estimated distances

2.4 Detection functions

2.5 Detection surface

2.6 Effective Sampling Area

(Back to contents)


Over the past decade SECR has become an increasingly popular tool for wildlife population assessment and has been used to analyse survey data for a wide range of animal groups. The main advantage it has over traditional capture-recapture techniques is that it allows direct estimation of population density rather than abundance. Traditional capture-recapture methods can only provide density estimates through the use of separate estimates (or assumptions) about the size of the sampled area. In SECR however, density is estimated from the survey data by using information contained in the pattern of the recapture data (relative to the locations of the detectors) to make inferences about the spatial location of animals. By extracting spatial information in this way SECR can provide direct estimates of density without requiring the exact locations of the detected animals to be known in advance.

2.1 Basic setup

The basic data collection setup for an SECR analysis consists of a spatial array of detectors. Detectors come in a variety of different forms, including traps which physically detain the animals, and proximity detectors which do not. Using proximity detectors it is possible for an animal to be detected at more than one detector (i.e. recaptured) during a single sampling occasion.

The plot below shows a hypothetical array of proximity detectors, with red squares representing detections of the same animal (or the same group in the case of gibbon surveys) and black squares representing no detections.




The pattern of the detections (i.e. the pattern of the recapture data) gives us information about the true location of the animal/group; intuitively we would guess that it is probably near the cluster of red detectors. The plot below shows a set of probability contours for this unknown location, given the recapture data.




In the case of acoustic gibbon surveys the listening posts can be treated as proximity detectors and the same logic can be applied to infer the unknown locations of the detected groups. However, the design shown in the figure above would obviously be impractical for gibbon surveys. The next figure shows probability contours for a more realistic array of listening posts where a group has been detected at two of the posts.




(Back to top of section)
(Back to contents)

2.2 Estimated bearings

As you probably guessed from the previous section, using fewer detectors results in less information on the unknown locations. Fortunately however, SECR also allows supplementary information on group location to be included in the analysis – for example in the form of estimated bearings to the detected animals/groups. The next figure illustrates how taking account of information contained in the estimated bearings can provide better quality information on animal/group locations.




Using estimated bearings in this way can lead to density estimates that are less biased and more precise than using recapture data alone. Since the precision of bearing estimates is usually unknown, SECR methods estimate it from the data. This requires the choice of a bearing error distribution. The figure below shows two common choices of distribution for modelling bearing errors – the von Mises and the wrapped Cauchy – where the colour of the lines indicates the value of the precision parameter (SECR estimates the value of this parameter from the survey data).



The wrapped Cauchy is likely to perform better when most of the estimates are close to the truth but there are occasional large errors. The von Mises is likely to perform better when large errors are rare.
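For illustration, the two error densities can be written out in a few lines of base R using their standard formulas (the function names and parameter values below are made up for this sketch; gibbonsecr uses its own internal implementations):

```r
# von Mises density: exp(kappa * cos(theta - mu)) / (2 * pi * I0(kappa)),
# where I0 is the modified Bessel function of order 0 (besselI in base R)
dvonmises <- function(theta, mu = 0, kappa = 5) {
  exp(kappa * cos(theta - mu)) / (2 * pi * besselI(kappa, 0))
}

# wrapped Cauchy density, with concentration parameter rho in [0, 1)
dwrappedcauchy <- function(theta, mu = 0, rho = 0.7) {
  (1 - rho^2) / (2 * pi * (1 + rho^2 - 2 * rho * cos(theta - mu)))
}

# plot the two densities over the range of possible bearing errors
theta <- seq(-pi, pi, length.out = 361)
plot(theta, dvonmises(theta), type = "l",
     xlab = "Bearing error (radians)", ylab = "Density")
lines(theta, dwrappedcauchy(theta), lty = 2)
```

Plotting the two curves side by side shows the heavier tails of the wrapped Cauchy relative to its peak, which is what makes it more robust to occasional large errors.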

(Back to top of section)
(Back to contents)

2.3 Estimated distances

TODO

(Back to top of section)
(Back to contents)

2.4 Detection functions

Another key feature of SECR is that the probability of detecting a (calling) gibbon group at a given location is modelled as a function of distance from the detector. This function – referred to as the detection function – is typically assumed to belong to one of two main types of function: the half normal or the hazard rate. The specific shape of the detection function depends on the value of its parameters, which need to be estimated from the survey data. The half normal has two parameters: g0 and sigma. The g0 parameter gives the probability at zero distance and the sigma parameter controls the width of the function. The hazard rate has three parameters: g0, sigma and z. The z parameter controls the shape of the ‘shoulder’ and adds a greater degree of flexibility. The figure below illustrates the shape of these detection functions for a range of parameter values.
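As a rough sketch, the two detection functions can be written out in base R using their standard SECR forms (the function names and parameter values below are chosen for this example):

```r
# half normal: g(d) = g0 * exp(-d^2 / (2 * sigma^2))
halfnormal <- function(d, g0 = 1, sigma = 500) {
  g0 * exp(-d^2 / (2 * sigma^2))
}

# hazard rate: g(d) = g0 * (1 - exp(-(d / sigma)^(-z)))
hazardrate <- function(d, g0 = 1, sigma = 500, z = 3) {
  g0 * (1 - exp(-(d / sigma)^(-z)))
}

# plot both functions against distance from the detector
d <- seq(0, 2000, length.out = 201)
plot(d, halfnormal(d), type = "l", ylim = c(0, 1),
     xlab = "Distance from detector (m)", ylab = "Detection probability")
lines(d, hazardrate(d), lty = 2)
```

At zero distance both functions return g0, and increasing z gives the hazard rate a flatter shoulder before a sharper drop.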



(Back to top of section)
(Back to contents)

2.5 Detection surface

The association of a detection function with each detector allows the overall probability of detection by at least one detector during the survey to be calculated for any given animal/group location. The figure below illustrates this idea of overall detection probability using a heat map of a detection surface.

The region near the centre of the surface is close to the detector array and has the highest detection probability. For example, in the figure above an animal/group near the detectors will almost certainly be detected. This probability declines as distance from the detectors increases.
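The idea can be sketched numerically in a few lines of R. If an animal/group at a given location is detected independently at each detector, the overall probability of detection by at least one detector is 1 minus the product of the non-detection probabilities. The detector locations, detection function and parameter values below are all made up for illustration:

```r
# half normal detection function (illustrative parameter values)
halfnormal <- function(d, g0 = 1, sigma = 500) g0 * exp(-d^2 / (2 * sigma^2))

# three hypothetical listening posts 500m apart
posts <- data.frame(x = c(-500, 0, 500), y = c(0, 0, 0))

# overall probability of detection by at least one post:
# p(x) = 1 - prod_k (1 - g(d_k)), where d_k is the distance to post k
overall_p <- function(x, y, g0 = 0.9, sigma = 500) {
  d <- sqrt((x - posts$x)^2 + (y - posts$y)^2)
  1 - prod(1 - halfnormal(d, g0, sigma))
}

# evaluate on a grid and draw a heat map of the detection surface
xs <- seq(-3000, 3000, by = 100)
ys <- seq(-3000, 3000, by = 100)
surface <- outer(xs, ys, Vectorize(function(x, y) overall_p(x, y)))
image(xs, ys, surface, xlab = "x (m)", ylab = "y (m)")
```

Locations near the centre of the array come out with a probability close to 1, while locations a few kilometres away come out close to 0.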

(Back to top of section)
(Back to contents)

2.6 Effective sampling area

The shape of the detection surface is related to the size of the effective sampling area. Since the region close to the detectors has a very high detection probability, most animals/groups within this region will be detected and this region will therefore be almost perfectly sampled. However, regions where the detection probability is less than 1 will not be completely sampled as some animal/groups in these areas will be missed. The figure below illustrates this idea for a series of arbitrary detection surfaces.

The first plot in this figure shows a flat surface where the detection probability is 0.5 everywhere. In this scenario every animal/group has a 50% chance of being detected. If the area covered by the surface was 10km2, then the effective sampling area would be 10km2 x 0.5 = 5km2. Using this detection process we would expect to detect the same number of animals/groups as we would if we perfectly sampled an area of 5km2. In the second plot half of the area is sampled perfectly and the other half is not sampled at all, so this has the same effective sampling area as the first plot. The third plot has a detection gradient and isn’t as intuitive to interpret. However, in general the effective sampling area is calculated as the volume under the detection surface. The third plot has the same volume as the other two, so it has the same effective sampling area.
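The volume-under-the-surface calculation can be checked with a toy example in R: discretise the region into mask cells, multiply the detection probability in each cell by the cell area, and sum. The cell size and the three surfaces below mirror the illustrative figures:

```r
# a 10 km2 region split into 1000 cells of 0.01 km2 (e.g. 100m x 100m)
cell_area_km2 <- 0.01
n_cells <- 1000

p_flat     <- rep(0.5, n_cells)                 # 0.5 everywhere
p_halves   <- c(rep(1, 500), rep(0, 500))       # half perfect, half unsampled
p_gradient <- seq(0, 1, length.out = n_cells)   # linear detection gradient

# effective sampling area = volume under the detection surface
esa <- function(p) sum(p * cell_area_km2)

esa(p_flat)      # 5 km2
esa(p_halves)    # 5 km2
esa(p_gradient)  # 5 km2
```

All three surfaces enclose the same volume, so all three give the same effective sampling area of 5km2.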


(Back to top of section)
(Back to contents)


3. Loading the software

3.1 Launch from R
— 3.1.1 Install R
— 3.1.2 Install prerequisite R packages
— 3.1.3 Install the gibbonsecr package
— 3.1.4 Launch the user interface

3.2 Launch from a desktop icon
— 3.2.1 Download the files
— 3.2.2 Make a shortcut icon

(Back to contents)


There are currently two ways to install the gibbonsecr software: (i) by installing a statistical software package called R from which you can launch the user interface; or (ii) by downloading a pre-compiled version of R and adding a shortcut icon for the user interface to the desktop.

Note to Mac users: Before you begin the installation process you need to make sure you have the XQuartz software (also known as X11) on your machine. Users of OS X 10.5 (Leopard), 10.6 (Snow Leopard) and 10.7 (Lion) should already have this installed by default (to check, look for the X11.app application in your applications folder). Users of OS X 10.8 (Mountain Lion), 10.9 (Mavericks) and 10.10 (Yosemite) will need to install it manually.

Download XQuartz

3.1 Launch from R

3.1.1 Install R

Make sure you have the latest version of R installed.

Download R for Windows

Download R for Mac

Optionally you can also install something called RStudio which acts as an interface to R and is more user-friendly (it has syntax highlighting and auto-completion for example).

Download RStudio

3.1.2 Install prerequisite R packages

The gibbonsecr package uses some other R packages that don’t come with the default version of R, so you’ll need to install them manually by typing (or cutting and pasting) the code below into the R console.

install.packages(c("CircStats", "dplyr", "MASS", "secr", "tcltk2"),
                 dependencies = TRUE)

3.1.3 Install the gibbonsecr package

Once the prerequisite packages are installed you can install the gibbonsecr package. It’s currently hosted on GitHub but you can download it and install it by running the code below.

Windows users:

install.packages("https://github.com/dkidney/gibbonsecr/raw/master/binaries/gibbonsecr_1.0.zip", 
                 repos = NULL, type = "win.binary")

Mac users:

install.packages("devtools", dependencies = TRUE)
devtools::install_url("https://github.com/dkidney/gibbonsecr/raw/master/binaries/gibbonsecr_1.0.tgz")

3.1.4 Launch the user interface

You only need to run the above steps once. Once everything is installed you can launch the user interface by opening R (or RStudio) and typing the following lines into the console.

library(gibbonsecr)
gibbonsecr_gui()

(Back to top of section)
(Back to contents)

3.2 Launch from a desktop icon

3.2.1 Download the files

TODO

3.2.2 Make a shortcut icon

TODO


(Back to top of section)
(Back to contents)


4. Data import

4.1 CSV files
— 4.1.1 Detections
— 4.1.2 Posts
— 4.1.3 Covariates

4.2 Data details

4.3 Data buttons

(Back to contents)


The first step in conducting an analysis is to import your survey data. This is done via the Data tab.

**SCREENSHOT**

4.1 CSV files

As a minimum you need to prepare a detections file and a posts file. You can also include an optional covariates file. Advice on how to structure these files is given in the sections below. All raw data files need to be in .csv format. The file paths to your data files can be entered manually into the text entry boxes in the CSV files section, or you can navigate to the file path using the ... button.

4.1.1 Detections

The detections file contains a record of each detection, with one row per detection. For example, if group 1 was recorded at listening posts A and B then this would count as 2 detections. This file needs to have the following columns:

  • array – ID for the array
  • occasion – ID for day of the survey (typically an integer between 1 and 4)
  • post – ID for the listening post
  • group – ID for the group
  • bearing – Estimated bearing

The screenshot below shows an example detections file for a one-day (i.e. single-occasion) survey.

(Back to top of section)
(Back to contents)

4.1.2 Posts

The posts file contains information on the location and usage of the listening posts. This file needs to have one row per listening post and should contain the following columns:

  • array – ID for the array
  • post – ID for the listening post
  • x – Longitude coordinate (in metric units)
  • y – Latitude coordinate (in metric units)
  • usage – Indicator showing the sampling days on which the posts were operated. E.g. if on a 3-day survey a particular post was used on day 1 and day 3 but for some reason not on day 2, you would write 101 in the usage column for that row. Each row in the usage column should contain the same number of digits.
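If you have recorded post operation as a table of days, the usage strings can be built in R by pasting the daily indicators together. The matrix below is a made-up example for a 3-day survey with two posts:

```r
# 1 marks a day the post was operated, 0 a day it was not
operated <- matrix(c(1, 0, 1,    # post A: operated on days 1 and 3
                     1, 1, 1),   # post B: operated on all 3 days
                   nrow = 2, byrow = TRUE)

# collapse each row into a single usage string, e.g. "101"
usage <- apply(operated, 1, paste, collapse = "")
usage  # "101" "111"
```

Every string has the same number of digits (one per survey day), as required by the posts file format.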

The screenshot below shows an example posts file for a one-day survey.

(Back to top of section)
(Back to contents)

4.1.3 Covariates

The covariates file contains information on environmental and other variables associated with the survey data. This file needs to have one row per day for each listening post and should contain the following columns:

  • array – ID for the array
  • post – ID for the listening post
  • occasion – ID for day of the survey (typically an integer between 1 and 4)

These columns can all be used as covariates themselves, but any additional covariates should be added using additional columns. Use underscores _ instead of full stops for the covariate names.

The screenshot below shows an example covariates file for a one-day survey.

(Back to top of section)
(Back to contents)

4.2 Data details

Once the paths to the data files have been entered, select the relevant units from the Data details dropdown boxes for your estimated bearings data (and estimated distances data if it was collected). (Note that the current version of the software only allows Type = continuous since interval methods for bearings and distances haven’t yet been implemented.)

(Back to top of section)
(Back to contents)

4.3 Data buttons

Once the paths to your data files have been added and the data details checked, you can then press the Import button. If your data imports successfully then a summary print out should appear in the output window. You can re-print this summary at any time by pressing the Summary button.

The screenshot below shows an example of some summary output after a successful data import.

**SCREENSHOT**


(Back to top of section)
(Back to contents)


5. Making a mask

5.1 Mask size and resolution
— 5.1.1 Buffer
— 5.1.2 Spacing

5.2 SHP files
— 5.2.1 Region
— 5.2.2 Habitat

5.3 Mask buttons

(Back to contents)

The SECR model fitting procedure requires the use of a mask which is a fine grid of latitude and longitude coordinates around each array of listening posts. When an SECR model is fitted, the mask is used to provide a set of candidate locations for each detected group. It is important to use a suitable mask to avoid unreliable results.

5.1 Mask size and resolution

There are two main settings you need to consider when defining a mask – the buffer and the spacing – which you can specify in the Mask tab.

5.1.1 Buffer

The buffer defines the maximum distance between the mask points and the listening posts. It needs to be large enough that the region it encompasses contains all plausible locations for the detected groups. Buffer distances that are too small will lead to underestimates of the effective sampling area and therefore overestimates of density. However, increasing the buffer distance also increases the number of mask points, which means that models will take longer to fit, so the buffer shouldn’t be larger than it needs to be. The ideal buffer distance is the distance at which the overall detection probability drops to zero.

A good way to check whether the buffer distance is large enough is to look at the detection surface, which you can plot after fitting a model (see the section on plotting results). The detection surface plot produced by gibbonsecr is the same size as the mask, so the colour at the edge of the plot will show you the overall detection probability at the buffer distance. If the detection probability is greater than zero at the buffer distance then you should increase the buffer distance, re-fit the model and re-check the detection surface plot.

To illustrate this issue, the figure below shows a series of detection surfaces from models that were fitted using mask buffers of 1000m, 10000m and 5000m respectively.

The buffer in plot 1 looks to be too small, since the detection probability at the buffer distance is much greater than zero. In this case it is extremely likely that the true locations of some of the detected groups were actually outside the buffer zone. In plot 2 the buffer has been increased to 10000m. The detection probability at the buffer distance looks to be zero, so we would expect the density estimate to be unbiased. The density estimate in plot 2 is about 75% lower than the estimate in plot 1, which suggests that the estimate in plot 1 is a big overestimate. The buffer distance in plot 3 is intermediate between the other two. The detection probability is still zero at the buffer distance, and the estimated density is very similar to plot 2, so it doesn’t look to be biased. In this case the mask in plot 3 would be preferred since (for a given resolution) model fitting will be much quicker than with the mask in plot 2, whilst still giving reliable results.

(Back to top of section)
(Back to contents)

5.1.2 Spacing

The mask spacing is the distance between adjacent mask points. Decreasing the spacing therefore increases the resolution and the total number of mask points. Smaller spacings provide a greater number of candidate locations and lead to more reliable results. However, increasing the number of mask points has a cost in terms of computing time, and if the spacing is too small then models may take a very long time to run. As a general rule of thumb, use the smallest spacing that is practical given the speed of your computer, and try not to use spacings larger than 250m.

(Back to top of section)
(Back to contents)

5.2 SHP files

The Mask tab also allows you to upload shapefiles in order to attach spatial covariate values to each of the mask points for use in model formulas.

NB: Make sure the projection units in your .shp files are compatible with the coordinate system used in your posts data file

5.2.1 Region

TODO

5.2.2 Habitat

TODO

5.3 Mask buttons

TODO


(Back to top of section)
(Back to contents)


6. Model fitting

6.1 Model options

6.2 Model parameters
— 6.2.1 Formulas
— 6.2.2 Fixing parameter values
— 6.2.3 Estimating g0
— 6.2.4 Estimating calling probability

6.3 Model buttons

(Back to contents)


Once you have made a mask you can move on to the Model tab and start fitting some SECR models.

**SCREENSHOT**

Specifying a model is split into two steps: (i) choosing what kind of detection function and bearing error distribution you want to use, and (ii) deciding whether to fix any parameter values or model them using the available covariates. These steps are described in more detail below.

6.1 Model options

The first section in the Model tab contains dropdown boxes where you can choose between different detection functions and different distributions for the estimated bearings and distances.

Setting the bearings/distances distribution to none means that the bearings/distances data will be ignored in the analysis. Setting both bearings and distances distributions to none will result in a conventional SECR model being fitted using only the recapture data.

(Back to top of section)
(Back to contents)

6.2 Model parameters

The next section in the Model tab provides various options for refining your model. Each row in this section relates to a particular parameter in the SECR model.

Don’t worry if you forget these definitions, hovering your cursor over the row labels on the user interface will open a temporary help box to give you a reminder.

(Back to top of section)
(Back to contents)

6.2.1 Formulas

If you wish to estimate a particular parameter in your analysis then you need to make sure that the Formula entry box for that parameter is activated by clicking on the radio button on the right hand side of the box. If the Formula box is activated but left blank then a single coefficient for that parameter will be estimated (i.e. an intercept-only model). If you wish to model a parameter using the available covariates then type the names of the covariates into the Formula box, separated by + signs. E.g. to model the sigma parameter using habitat and weather you would type,

habitat + weather

into the Formula box for sigma.

A note to experienced R-users: As well as +, you can also use the * and : operators to specify formulas. You can also use the gam smooth functions s, te, ti and t2 (from the mgcv package) for numeric variables. However, the use of as.factor and as.numeric to coerce variables, and of -1 to change the model contrasts, is not supported.

(Back to top of section)
(Back to contents)

6.2.2 Fixing parameter values

Sometimes you may not want or need to estimate a particular parameter, in which case you can fix its value. To do this, click on the radio button on the right hand side of the Fixed entry box and type the value of the parameter in the box.

A general point to bear in mind when fixing any parameter is that it will generally lead to a more precise density estimate (i.e. one with narrower confidence intervals). If the fixed parameter is known with a high degree of certainty then this would be a desirable effect. However, if there is uncertainty over the true value of that parameter (e.g. you may have used an estimate from a previous study) then this will not be incorporated into the SECR results and the precision of the density estimate will be overestimated (i.e. the confidence intervals generated by the software will be too narrow).

(Back to top of section)
(Back to contents)

6.2.3 Estimating g0

For one-day surveys the only option allowed by the software is to fix g0 at 1. This is because when a listening post is zero distance from a calling group it is extremely unlikely that the group will go undetected. (Remember that the g0 parameter gives the detection probability for a calling group at zero distance from the listening post.)

For multi-day surveys however, the movement of groups between consecutive sampling days means that we have to redefine group ‘location’ as being the average location of the group. As a result, g0 needs to be reinterpreted as the probability of detecting a calling group at zero distance from the average location. In this case it is much more likely that the detection probability for a calling group whose average location is zero distance from the listening post will be less than one. This is because a group is unlikely to always be at its average location during a multi-day survey (unless it happens not to move). For multi-day survey data it is therefore a good idea to estimate g0.

(Back to top of section)
(Back to contents)

6.2.4 Estimating calling probability

For one-day surveys, pcall can’t be estimated so the only option is to provide a fixed value. By default, pcall is fixed at 1 for one-day surveys, which means that the D parameter can be interpreted as the density of calling groups, rather than the density of groups. However, the software allows you to change this value (e.g. you may have prior knowledge of the calling probability for the study species) in which case the density parameter can be reinterpreted as the density of groups. For one-day surveys, changing the pcall value will result in a direct scaling of the density estimate. For example, if you had an estimated calling group density of 5, changing the fixed value for pcall to 0.5 and re-fitting the model would result in a group density estimate of 10.
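The scaling described above is just a division, as the following one-line R check shows (the numbers are the hypothetical values from the text):

```r
# density of groups = density of calling groups / calling probability
D_calling <- 5     # estimated calling group density
pcall     <- 0.5   # fixed calling probability
D_groups  <- D_calling / pcall
D_groups   # 10
```

In practice you would change the fixed value in the Fixed entry box and re-fit, but the resulting density estimate scales in exactly this way.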

For multi-day surveys there are three possible options for dealing with the pcall parameter when analysing your data.

  1. Estimate pcall from the survey data – This requires that the survey data contain temporal recaptures – i.e. the group IDs should indicate which groups were detected on more than one survey day. The temporal recapture data needs to be reliable for this method to work. Note that the SECR software considers data from each array independently, so you only need to identify the temporal recaptures within arrays (and not between arrays).

  2. Fix pcall to a known value – E.g. using data from a previous survey. Bear in mind that fixing pcall to a value that is too low will result in an overestimate of group density. Furthermore, remember that fixing the value of pcall is likely to lead to an overestimate of the precision of your group density estimate.

  3. Fix pcall to 1 and estimate calling group density – This can be done by treating the data from each survey day independently. The only way to do this currently is to modify the array and occasion columns in your data files before you import the data. The values in the array columns should be edited so that each array-occasion combination has a unique ID – for example, array “1” for occasion 1 could be re-labelled “1_1”, array “1” for occasion 2 could be re-labelled “1_2”, etc. All entries in the occasion columns should be set to 1, and all entries in the usage column of the posts file should be set to “1”. The SECR software will then treat each occasion independently and ignore any temporal recaptures. Note that future versions of the software will hopefully automate this process.
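The relabelling in option 3 can be sketched in a few lines of base R. This is only an illustration: the data frames below are made up, and in practice you would read your csv files, apply the same changes and write them back out before importing.

```r
# tiny made-up detections and posts tables for a 2-day survey
detections <- data.frame(array = c(1, 1, 2), occasion = c(1, 2, 1),
                         post = c("A", "B", "A"), group = c(1, 1, 2))
posts <- data.frame(array = c(1, 2), post = c("A", "A"),
                    x = c(0, 5000), y = c(0, 0), usage = c("11", "11"))
n_occ <- 2

# give each array-occasion combination a unique array ID, e.g. "1_2",
# then set every occasion to 1
detections$array <- paste(detections$array, detections$occasion, sep = "_")
detections$occasion <- 1

# replicate each post once per day, relabelled to match, with usage "1"
posts_by_day <- do.call(rbind, lapply(seq_len(n_occ), function(occ) {
  p <- posts
  p$array <- paste(p$array, occ, sep = "_")
  p$usage <- "1"
  p
}))
```

After these changes each survey day looks like a separate single-occasion array, so the software treats the occasions independently and ignores any temporal recaptures.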

(Back to top of section)
(Back to contents)

6.3 Model buttons

There are four buttons at the bottom of the Model tab:

  1. Fit – Use this button to fit the model once you are happy with your model specification. Once your model is fitted a summary will appear in the output console.
  2. Summary – Use this button to reprint the model summary.
  3. Coef – This prints a summary of the raw model coefficients.
  4. Predict – This allows you to predict the value of each model parameter. If you have used covariates in model formulas a pop-up box will appear allowing you to choose specific values of covariates (e.g. you may wish to predict density for each level of a habitat covariate).


(Back to top of section)
(Back to contents)


7. Plotting results

Once you have fitted a model you can plot the results. The Plots tab currently has five plotting options:

  1. Detection function
  2. Detection Surface
  3. Density surface
  4. Bearing error distribution
  5. Distance estimates distribution

There is currently no option for choosing covariate levels when making these plots. By default they use the intercept coefficients for each model parameter.

The plots below are from a model fitted to the example data set provided with the software (see Section 9.1.1).

(Back to top of section)
(Back to contents)


8. Model selection

8.1 AIC

8.2 Model plausibility


An important element of statistical modelling is choosing a preferred model from a number of candidate models. For example, you may get a slightly different density estimate when using the hazard rate detection function instead of the half normal detection function. How do you decide which model, and therefore which density estimate, should be preferred?

8.1 AIC

A common way of choosing between competing models is to use something called the AIC score. This is a number, calculated from the fitted model, that measures how well the model balances having a good fit to the data against not being overly complex. Models with lower AIC scores are preferred. The AIC score can be found at the bottom of the model summary printout (which is displayed after pressing the model Summary button at the bottom of the Model tab).

When using AIC it is important to bear in mind the following points:

TODO

(Back to top of section)
(Back to contents)

8.2 Model plausibility

Whilst AIC can be extremely useful it shouldn’t be used blindly and you should also ensure that any preferred model is also plausible. For example, model A might have a lower AIC score than model B, but if model A looks entirely unrealistic (e.g. given your knowledge of the study system) then you should discard it. For example, a fitted bearing error distribution which implied that errors as large as 180 degrees were highly probable might be ignored if such an outcome is known to be very unlikely under normal field conditions.


(Back to top of section)
(Back to contents)


10. Reporting bugs

Version 1.0 of the software is the first release version. Whilst it has been tested on a number of data sets there may still be some bugs. If you encounter anything that needs fixing, or if you have any other suggestions or feedback about the software, please email Darren at darrenkidney@yahoo.co.uk.


(Back to contents)